Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data
Abstract
We provide a unified optimization view of iterative Hessian sketch (IHS) and iterative dual random projection (IDRP). We establish a primal-dual connection between the Hessian sketch and dual random projection, and show that their iterative extensions are preconditioned optimization processes. Building on this insight and on conjugate gradient descent, we develop accelerated versions of IHS and IDRP, and propose a primal-dual sketch method that simultaneously reduces both the sample size and the dimensionality.
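To make the building block concrete, the following is a minimal NumPy sketch of the classical iterative Hessian sketch update for unconstrained least squares, min_x 0.5*||Ax - b||^2, in which the sketched Hessian (SA)^T(SA) preconditions the exact gradient. It illustrates the general idea, not this paper's accelerated or primal-dual variants; the Gaussian sketch matrix, sketch size, and iteration count below are illustrative choices.

```python
import numpy as np

def iterative_hessian_sketch(A, b, sketch_size, n_iter=5, rng=None):
    """Minimal IHS-style solver for min_x 0.5*||A x - b||^2: the sketched
    Hessian (S A)^T (S A) preconditions the exact full-data gradient."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iter):
        # Fresh Gaussian sketch each iteration (other sketches, e.g. a
        # subsampled randomized Hadamard transform, would also work).
        S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)
        SA = S @ A                           # m x d, with m << n
        grad = A.T @ (A @ x - b)             # exact gradient over the full data
        # Preconditioned step: solve the small d x d system (SA^T SA) delta = -grad
        delta = np.linalg.solve(SA.T @ SA, -grad)
        x = x + delta
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5000, 50))
    x_true = rng.standard_normal(50)
    b = A @ x_true + 0.01 * rng.standard_normal(5000)
    x_ihs = iterative_hessian_sketch(A, b, sketch_size=500, n_iter=5, rng=1)
    x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
    print("distance to exact LS solution:", np.linalg.norm(x_ihs - x_ls))
```

Each iteration touches the full data only through products with A and A^T, while the linear system that is actually solved is a small d x d system built from the m x d sketched matrix.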
Similar resources
Recovering the Optimal Solution by Dual Random Projection
Random projection has been widely used in data classification. It maps high-dimensional data into a low-dimensional subspace in order to reduce the computational cost of solving the related optimization problem. While previous studies have focused on analyzing the classification performance of random projection, in this work we consider the recovery problem, i.e., how to accurately recove...
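Since the abstract above is truncated, the following is a minimal sketch of the recovery idea for one simple instance, ridge regression with a low-rank data matrix: solve the randomly projected problem in its dual form, then map the dual variables back through the original high-dimensional features. The Gaussian projection, the squared loss, and all problem sizes are illustrative assumptions rather than the referenced paper's exact setting.

```python
import numpy as np

def dual_random_projection_ridge(X, y, lam, proj_dim, rng=None):
    """Illustrative dual-random-projection recovery for ridge regression:
    min_w 0.5*||X w - y||^2 + 0.5*lam*||w||^2.

    1. Project the features: X_hat = X @ R with a random R in R^{d x m}.
    2. Solve the projected problem in its dual form to get alpha (n x n system).
    3. Recover a high-dimensional solution as w = X^T alpha, using the fact
       that the optimal ridge solution lies in the row space of X.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    R = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)
    X_hat = X @ R                                    # n x m, with m << d
    alpha = np.linalg.solve(X_hat @ X_hat.T + lam * np.eye(n), y)
    # Recovery step: lift the dual variables back through the ORIGINAL features.
    return X.T @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, rank = 300, 2000, 10
    X = (rng.standard_normal((n, rank)) @ rng.standard_normal((rank, d))) / np.sqrt(rank)
    w_true = np.zeros(d)
    w_true[:20] = 1.0
    y = X @ w_true + 0.01 * rng.standard_normal(n)
    lam = 1.0
    w_exact = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    w_rec = dual_random_projection_ridge(X, y, lam, proj_dim=500, rng=1)
    print("relative recovery error:", np.linalg.norm(w_rec - w_exact) / np.linalg.norm(w_exact))
```

The recovery step never solves a d-dimensional system: the projected problem and its dual are low-dimensional, and the lift w = X^T alpha only requires one pass over the original data.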
IMPROVED BIG BANG-BIG CRUNCH ALGORITHM FOR OPTIMAL DIMENSIONAL DESIGN OF STRUCTURAL WALLS SYSTEM
Among the different lateral force resisting systems, shear walls provide appropriate stiffness and hence are extensively employed in the design of high-rise structures. Architectural concerns regarding the safety of these structures have further widened the application of coupled shear walls. The present study investigated the optimal dimensional design of coupled shear walls based on the im...
A Deterministic Analysis of Noisy Sparse Subspace Clustering for Dimensionality-reduced Data
Subspace clustering groups data into several low-rank subspaces. In this paper, we propose a theoretical framework to analyze a popular optimization-based algorithm, Sparse Subspace Clustering (SSC), when the data dimension is compressed via some random projection algorithms. We show that SSC provably succeeds if the random projection is a subspace embedding, which includes random Gaussian projection...
Towards Making High Dimensional Distance Metric Learning Practical
In this work, we study distance metric learning (DML) for high-dimensional data. A typical approach to DML with high-dimensional data is to perform dimensionality reduction before learning the distance metric. The main shortcoming of this approach is that it may result in a suboptimal solution due to the subspace removed by the dimensionality reduction method. In this work, we presen...
Sparse Learning for Large-Scale and High-Dimensional Data: A Randomized Convex-Concave Optimization Approach
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists an (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or...
Publication date: 2017